Automated Generation of Category-Specific Thesauri for Interactive Query Expansion

Author

  • Fabrizio Sebastiani
Abstract

The categorisation of documents into subject-specific categories is a useful enhancement for large document collections served by information retrieval (IR) systems: a user can first browse a category tree in search of the category that best matches her interests, and then issue a query for more specific documents “from within the category”. This approach combines the two information-seeking modalities that are most popular in Web-based search engines, i.e. category-based site browsing (as exemplified by Yahoo!) and keyword-based document querying (as exemplified by AltaVista). In the framework of the Eurosearch Project, we are addressing the problem of the automatic categorisation of Web documents and sites within Yahoo!-like hierarchies of categories [1,4]. The tools resulting from this project make it possible to overcome a major bottleneck in today’s Web information organisation, i.e. the need for manual categorisation of Web documents and sites. Manual categorisation is inadequate in view of the ever-increasing size of the Web and its ever-evolving content.

Similar resources

Interactive Query Expansion with Automatically Generated Category-Specific Thesauri

The categorization of documents into subject-specific categories is a useful enhancement for large document collections addressed by information retrieval systems, as a user can first browse a category tree in search of the category that best matches her interests, and then issue a query for more specific documents “from within the category”. This approach combines two modalities in information ...

Creation and Maintenance of Query Expansion Rules

In an information retrieval system, a thesaurus can be used for query expansion, i.e. adding words to queries in order to improve recall. We propose a semi-automatic and interactive approach for the creation and maintenance of domain-specific thesauri for query expansion. Domain-specific thesauri are especially required in highly technical domains where the use of general thesauri for query exp...

User Comprehension and Searching with Information Retrieval Thesauri

While information retrieval thesauri may improve search results, there is little research documenting whether general information system users employ these vocabulary tools. This article explores user comprehension and searching with thesauri. Data were gathered as part of a larger empirical query-expansion study involving the ProQuest Controlled Vocabulary. The results suggest that users’ kno...

Thesaurus-assisted search term selection and query expansion: A review of user-centred studies

This paper provides a review of the literature related to the application of domain-specific thesauri in the search and retrieval process. Focusing on studies which adopt a user-centred approach, the review presents a survey of the methodologies and results from empirical studies undertaken on the use of thesauri as sources of term selection for query formulation and expansion during the search ...

Associative and Spatial Relationships in Thesaurus-Based Retrieval

The OASIS (Ontologically Augmented Spatial Information System) project explores terminology systems for thematic and spatial access in digital library applications. A prototype implementation uses data from the Royal Commission on the Ancient and Historical Monuments of Scotland, together with the Getty AAT and TGN thesauri. This paper describes its integrated spatial and thematic schema and di...

Publication date: 1998